Dataset statistics
| Number of variables | 16 |
|---|---|
| Number of observations | 100 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 12.6 KiB |
| Average record size in memory | 129.3 B |
Variable types
| Numeric | 8 |
|---|---|
| Categorical | 2 |
| Unsupported | 6 |
Country has a high cardinality: 100 distinct values | High cardinality |
New Cases is highly correlated with New Deaths and 1 other fields | High correlation |
New Deaths is highly correlated with New Cases and 1 other fields | High correlation |
New Recovered is highly correlated with New Cases and 1 other fields | High correlation |
df_index is highly correlated with New Cases and 2 other fields | High correlation |
New Cases is highly correlated with df_index and 2 other fields | High correlation |
New Deaths is highly correlated with df_index and 2 other fields | High correlation |
New Recovered is highly correlated with df_index and 2 other fields | High correlation |
Active Cases is highly correlated with Continent and 6 other fields | High correlation |
df_index is highly correlated with Country | High correlation |
Continent is highly correlated with Active Cases and 7 other fields | High correlation |
Country is highly correlated with Active Cases and 8 other fields | High correlation |
New Deaths is highly correlated with Active Cases and 6 other fields | High correlation |
New Cases is highly correlated with Active Cases and 6 other fields | High correlation |
Total Recovered is highly correlated with Active Cases and 6 other fields | High correlation |
Total Cases/1M is highly correlated with Continent and 1 other fields | High correlation |
Total Cases is highly correlated with Active Cases and 6 other fields | High correlation |
New Recovered is highly correlated with Active Cases and 6 other fields | High correlation |
Continent is highly correlated with Country | High correlation |
Country is highly correlated with Continent | High correlation |
Country is uniformly distributed | Uniform |
df_index has unique values | Unique |
Country has unique values | Unique |
Total Cases has unique values | Unique |
Total Recovered has unique values | Unique |
Total Cases/1M has unique values | Unique |
Total Deaths is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
Serious/Critical is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
Deaths/1M is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
Total Tests is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
Test/1M is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
Population is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
Reproduction
| Analysis started | 2021-07-22 13:14:08.951251 |
|---|---|
| Analysis finished | 2021-07-22 13:20:57.130528 |
| Duration | 6 minutes and 48.18 seconds |
| Software version | pandas-profiling v3.0.0 |
| Download configuration | config.json |
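The reported duration can be rechecked from the two timestamps in the table above. The snippet below is a minimal sketch using only the Python standard library; the variable names are illustrative and are not part of the report.

```python
from datetime import datetime

# Timestamps copied from the Reproduction table.
started = datetime.fromisoformat("2021-07-22 13:14:08.951251")
finished = datetime.fromisoformat("2021-07-22 13:20:57.130528")

# Subtracting two datetimes yields a timedelta.
elapsed = finished - started

# Split the elapsed seconds into whole minutes and remaining seconds.
minutes, seconds = divmod(elapsed.total_seconds(), 60)
summary = f"{int(minutes)} minutes and {seconds:.2f} seconds"
print(summary)
```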
df_index
| Distinct | 100 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 115.15 |
| Minimum | 5 |
|---|---|
| Maximum | 211 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 928.0 B |
Quantile statistics
| Minimum | 5 |
|---|---|
| 5-th percentile | 22.45 |
| Q1 | 60.75 |
| median | 119.5 |
| Q3 | 172.5 |
| 95-th percentile | 203.05 |
| Maximum | 211 |
| Range | 206 |
| Interquartile range (IQR) | 111.75 |
Descriptive statistics
| Standard deviation | 60.93204942 |
|---|---|
| Coefficient of variation (CV) | 0.5291537075 |
| Kurtosis | -1.236516201 |
| Mean | 115.15 |
| Median Absolute Deviation (MAD) | 56 |
| Skewness | -0.08298493528 |
| Sum | 11515 |
| Variance | 3712.714646 |
| Monotonicity | Not monotonic |
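Several of the figures above are derived from others in the same tables: the mean is the sum divided by the number of observations, the coefficient of variation is the standard deviation divided by the mean, the variance is the standard deviation squared, and the range and IQR are simple differences. The following sketch rechecks those relationships using the reported values; the variable names are illustrative only.

```python
import math

# Values copied from the tables above (df_index, 100 observations).
n = 100
total = 11515
std = 60.93204942
q1, q3 = 60.75, 172.5
minimum, maximum = 5, 211

mean = total / n             # sum divided by number of observations
cv = std / mean              # coefficient of variation
variance = std ** 2          # variance is the squared standard deviation
iqr = q3 - q1                # interquartile range
value_range = maximum - minimum

print(mean, cv, variance, iqr, value_range)
```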
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 211 | 1 | 1.0% |
| 85 | 1 | 1.0% |
| 63 | 1 | 1.0% |
| 64 | 1 | 1.0% |
| 66 | 1 | 1.0% |
| 68 | 1 | 1.0% |
| 71 | 1 | 1.0% |
| 77 | 1 | 1.0% |
| 78 | 1 | 1.0% |
| 80 | 1 | 1.0% |
| Other values (90) | 90 |
| Value | Count | Frequency (%) |
| 5 | 1 | |
| 6 | 1 | |
| 9 | 1 | |
| 10 | 1 | |
| 12 | 1 | |
| 23 | 1 | |
| 24 | 1 | |
| 26 | 1 | |
| 29 | 1 | |
| 30 | 1 |
| Value | Count | Frequency (%) |
| 211 | 1 | |
| 210 | 1 | |
| 208 | 1 | |
| 205 | 1 | |
| 204 | 1 | |
| 203 | 1 | |
| 202 | 1 | |
| 200 | 1 | |
| 199 | 1 | |
| 198 | 1 |
Country
| Distinct | 100 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 928.0 B |
| Zambia | 1 |
|---|---|
| Western Sahara | 1 |
| French Guiana | 1 |
| Belarus | 1 |
| Bulgaria | 1 |
| Other values (95) |
Length
| Max length | 22 |
|---|---|
| Median length | 7.5 |
| Mean length | 8.82 |
| Min length | 3 |
Characters and Unicode
| Total characters | 882 |
|---|---|
| Distinct characters | 52 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 2 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
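As an illustration of how such character properties can be read programmatically, the sketch below counts Unicode general categories with Python's standard `unicodedata` module for the five sample country names listed in this report. It is not the code the profiling tool runs, and the helper name `category_counts` is an assumption made for the example. The two-letter codes correspond to the category names used in the tables: `Lu` is Uppercase Letter, `Ll` is Lowercase Letter, `Zs` is Space Separator, `Pd` is Dash Punctuation, and `Po` is Other Punctuation.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count the Unicode general category of every character in text."""
    return Counter(unicodedata.category(ch) for ch in text)


# The five sample values shown for the Country variable.
sample = ["Philippines", "Guinea-Bissau", "Guinea", "Ghana", "Trinidad and Tobago"]

counts = Counter()
for name in sample:
    counts.update(category_counts(name))

print(dict(counts))
```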
Unique
| Unique | 100 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | Philippines |
|---|---|
| 2nd row | Guinea-Bissau |
| 3rd row | Guinea |
| 4th row | Ghana |
| 5th row | Trinidad and Tobago |
Common Values
| Value | Count | Frequency (%) |
| Zambia | 1 | 1.0% |
| Western Sahara | 1 | 1.0% |
| French Guiana | 1 | 1.0% |
| Belarus | 1 | 1.0% |
| Bulgaria | 1 | 1.0% |
| Panama | 1 | 1.0% |
| Gabon | 1 | 1.0% |
| Morocco | 1 | 1.0% |
| Total: | 1 | 1.0% |
| Guinea-Bissau | 1 | 1.0% |
| Other values (90) | 90 |
Length
Histogram of lengths of the category
| Value | Count | Frequency (%) |
| and | 5 | 3.8% |
| islands | 4 | 3.0% |
| new | 3 | 2.3% |
| saint | 2 | 1.5% |
| guinea | 2 | 1.5% |
| french | 2 | 1.5% |
| gabon | 1 | 0.8% |
| dominican | 1 | 0.8% |
| ireland | 1 | 0.8% |
| hungary | 1 | 0.8% |
| Other values (111) | 111 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 146 | |
| i | 81 | 9.2% |
| n | 69 | 7.8% |
| e | 53 | 6.0% |
| r | 49 | 5.6% |
| o | 45 | 5.1% |
| s | 38 | 4.3% |
| (space) | 34 | 3.9% |
| t | 34 | 3.9% |
| l | 30 | 3.4% |
| Other values (42) | 303 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 711 | |
| Uppercase Letter | 134 | 15.2% |
| Space Separator | 34 | 3.9% |
| Other Punctuation | 2 | 0.2% |
| Dash Punctuation | 1 | 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 146 | |
| i | 81 | |
| n | 69 | |
| e | 53 | 7.5% |
| r | 49 | 6.9% |
| o | 45 | 6.3% |
| s | 38 | 5.3% |
| t | 34 | 4.8% |
| l | 30 | 4.2% |
| u | 30 | 4.2% |
| Other values (16) | 136 |
Uppercase Letter
| Value | Count | Frequency (%) |
| C | 14 | 10.4% |
| B | 13 | 9.7% |
| M | 13 | 9.7% |
| S | 12 | 9.0% |
| G | 10 | 7.5% |
| A | 9 | 6.7% |
| P | 6 | 4.5% |
| T | 6 | 4.5% |
| I | 6 | 4.5% |
| N | 6 | 4.5% |
| Other values (12) | 39 |
Other Punctuation
| Value | Count | Frequency (%) |
| : | 1 | |
| . | 1 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 1 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 34 | |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 845 | |
| Common | 37 | 4.2% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 146 | |
| i | 81 | 9.6% |
| n | 69 | 8.2% |
| e | 53 | 6.3% |
| r | 49 | 5.8% |
| o | 45 | 5.3% |
| s | 38 | 4.5% |
| t | 34 | 4.0% |
| l | 30 | 3.6% |
| u | 30 | 3.6% |
| Other values (38) | 270 |
Common
| Value | Count | Frequency (%) |
| (space) | 34 | |
| - | 1 | 2.7% |
| : | 1 | 2.7% |
| . | 1 | 2.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 881 | |
| Latin 1 Sup | 1 | 0.1% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 146 | |
| i | 81 | 9.2% |
| n | 69 | 7.8% |
| e | 53 | 6.0% |
| r | 49 | 5.6% |
| o | 45 | 5.1% |
| s | 38 | 4.3% |
| (space) | 34 | 3.9% |
| t | 34 | 3.9% |
| l | 30 | 3.4% |
| Other values (41) | 302 |
Latin 1 Sup
| Value | Count | Frequency (%) |
| ç | 1 |
Total Cases
| Distinct | 100 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2382947.66 |
| Minimum | 10 |
|---|---|
| Maximum | 192795147 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 928.0 B |
Quantile statistics
| Minimum | 10 |
|---|---|
| 5-th percentile | 303.85 |
| Q1 | 6253.25 |
| median | 46239.5 |
| Q3 | 382756 |
| 95-th percentile | 4242736.55 |
| Maximum | 192795147 |
| Range | 192795137 |
| Interquartile range (IQR) | 376502.75 |
Descriptive statistics
| Standard deviation | 19267189.56 |
|---|---|
| Coefficient of variation (CV) | 8.085443875 |
| Kurtosis | 99.28212616 |
| Mean | 2382947.66 |
| Median Absolute Deviation (MAD) | 45982 |
| Skewness | 9.947439574 |
| Sum | 238294766 |
| Variance | 3.712245936 × 10^14 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 3070 | 1 | 1.0% |
| 566356 | 1 | 1.0% |
| 81467 | 1 | 1.0% |
| 138300 | 1 | 1.0% |
| 42558 | 1 | 1.0% |
| 5911601 | 1 | 1.0% |
| 295746 | 1 | 1.0% |
| 25309 | 1 | 1.0% |
| 5594 | 1 | 1.0% |
| 6473 | 1 | 1.0% |
| Other values (90) | 90 |
| Value | Count | Frequency (%) |
| 10 | 1 | |
| 56 | 1 | |
| 131 | 1 | |
| 164 | 1 | |
| 206 | 1 | |
| 309 | 1 | |
| 557 | 1 | |
| 627 | 1 | |
| 949 | 1 | |
| 1005 | 1 |
| Value | Count | Frequency (%) |
| 192795147 | 1 | |
| 6030240 | 1 | |
| 5911601 | 1 | |
| 4798851 | 1 | |
| 4679994 | 1 | |
| 4219723 | 1 | |
| 1602854 | 1 | |
| 1524438 | 1 | |
| 1424715 | 1 | |
| 1096341 | 1 |
New Cases
| Distinct | 71 |
|---|---|
| Distinct (%) | 71.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7114.43 |
| Minimum | -1 |
|---|---|
| Maximum | 556432 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 26 |
| Negative (%) | 26.0% |
| Memory size | 928.0 B |
Quantile statistics
| Minimum | -1 |
|---|---|
| 5-th percentile | -1 |
| Q1 | -1 |
| median | 95.5 |
| Q3 | 798.5 |
| 95-th percentile | 13083.5 |
| Maximum | 556432 |
| Range | 556433 |
| Interquartile range (IQR) | 799.5 |
Descriptive statistics
| Standard deviation | 55698.03846 |
|---|---|
| Coefficient of variation (CV) | 7.828882772 |
| Kurtosis | 98.44422689 |
| Mean | 7114.43 |
| Median Absolute Deviation (MAD) | 96.5 |
| Skewness | 9.8870993 |
| Sum | 711443 |
| Variance | 3102271488 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| -1 | 26 | |
| 4 | 2 | 2.0% |
| 32 | 2 | 2.0% |
| 1 | 2 | 2.0% |
| 23 | 2 | 2.0% |
| 21539 | 1 | 1.0% |
| 79 | 1 | 1.0% |
| 26 | 1 | 1.0% |
| 1142 | 1 | 1.0% |
| 431 | 1 | 1.0% |
| Other values (61) | 61 |
| Value | Count | Frequency (%) |
| -1 | 26 | |
| 1 | 2 | 2.0% |
| 2 | 1 | 1.0% |
| 4 | 2 | 2.0% |
| 5 | 1 | 1.0% |
| 6 | 1 | 1.0% |
| 7 | 1 | 1.0% |
| 8 | 1 | 1.0% |
| 9 | 1 | 1.0% |
| 10 | 1 | 1.0% |
| Value | Count | Frequency (%) |
| 556432 | 1 | |
| 30587 | 1 | |
| 23704 | 1 | |
| 21539 | 1 | |
| 14632 | 1 | |
| 13002 | 1 | |
| 11244 | 1 | |
| 6549 | 1 | |
| 3940 | 1 | |
| 2705 | 1 |
New Deaths
| Distinct | 24 |
|---|---|
| Distinct (%) | 24.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 107.89 |
| Minimum | -1 |
|---|---|
| Maximum | 8640 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 55 |
| Negative (%) | 55.0% |
| Memory size | 928.0 B |
Quantile statistics
| Minimum | -1 |
|---|---|
| 5-th percentile | -1 |
| Q1 | -1 |
| median | -1 |
| Q3 | 4.75 |
| 95-th percentile | 64.3 |
| Maximum | 8640 |
| Range | 8641 |
| Interquartile range (IQR) | 5.75 |
Descriptive statistics
| Standard deviation | 867.1378893 |
|---|---|
| Coefficient of variation (CV) | 8.037240609 |
| Kurtosis | 97.50289946 |
| Mean | 107.89 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 9.822419596 |
| Sum | 10789 |
| Variance | 751928.1191 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=24)
| Value | Count | Frequency (%) |
| -1 | 55 | |
| 1 | 7 | 7.0% |
| 3 | 6 | 6.0% |
| 2 | 5 | 5.0% |
| 32 | 2 | 2.0% |
| 26 | 2 | 2.0% |
| 11 | 2 | 2.0% |
| 18 | 2 | 2.0% |
| 12 | 2 | 2.0% |
| 4 | 2 | 2.0% |
| Other values (14) | 15 | 15.0% |
| Value | Count | Frequency (%) |
| -1 | 55 | |
| 1 | 7 | 7.0% |
| 2 | 5 | 5.0% |
| 3 | 6 | 6.0% |
| 4 | 2 | 2.0% |
| 7 | 1 | 1.0% |
| 9 | 2 | 2.0% |
| 10 | 1 | 1.0% |
| 11 | 2 | 2.0% |
| 12 | 2 | 2.0% |
| Value | Count | Frequency (%) |
| 8640 | 1 | |
| 783 | 1 | |
| 437 | 1 | |
| 351 | 1 | |
| 108 | 1 | |
| 62 | 1 | |
| 52 | 1 | |
| 42 | 1 | |
| 40 | 1 | |
| 32 | 2 |
Total Recovered
| Distinct | 100 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2175618.94 |
| Minimum | 8 |
|---|---|
| Maximum | 175323677 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 928.0 B |
Quantile statistics
| Minimum | 8 |
|---|---|
| 5-th percentile | 261.5 |
| Q1 | 4950.75 |
| median | 38202 |
| Q3 | 328015.25 |
| 95-th percentile | 3715936.75 |
| Maximum | 175323677 |
| Range | 175323669 |
| Interquartile range (IQR) | 323064.5 |
Descriptive statistics
| Standard deviation | 17521399.74 |
|---|---|
| Coefficient of variation (CV) | 8.053524178 |
| Kurtosis | 99.2561387 |
| Mean | 2175618.94 |
| Median Absolute Deviation (MAD) | 38024 |
| Skewness | 9.945546776 |
| Sum | 217561894 |
| Variance | 3.069994487 × 10^14 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 4431871 | 1 | 1.0% |
| 293208 | 1 | 1.0% |
| 13122 | 1 | 1.0% |
| 1073475 | 1 | 1.0% |
| 175429 | 1 | 1.0% |
| 407622 | 1 | 1.0% |
| 304456 | 1 | 1.0% |
| 588 | 1 | 1.0% |
| 30266 | 1 | 1.0% |
| 13907 | 1 | 1.0% |
| Other values (90) | 90 |
| Value | Count | Frequency (%) |
| 8 | 1 | |
| 53 | 1 | |
| 58 | 1 | |
| 161 | 1 | |
| 195 | 1 | |
| 265 | 1 | |
| 462 | 1 | |
| 514 | 1 | |
| 588 | 1 | |
| 618 | 1 |
| Value | Count | Frequency (%) |
| 175323677 | 1 | |
| 5666440 | 1 | |
| 5404797 | 1 | |
| 4435550 | 1 | |
| 4431871 | 1 | |
| 3678256 | 1 | |
| 1557199 | 1 | |
| 1449556 | 1 | |
| 1393466 | 1 | |
| 1073475 | 1 |
New Recovered
| Distinct | 62 |
|---|---|
| Distinct (%) | 62.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4882.02 |
| Minimum | -1 |
|---|---|
| Maximum | 397019 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 34 |
| Negative (%) | 34.0% |
| Memory size | 928.0 B |
Quantile statistics
| Minimum | -1 |
|---|---|
| 5-th percentile | -1 |
| Q1 | -1 |
| median | 45 |
| Q3 | 658.5 |
| 95-th percentile | 5486.35 |
| Maximum | 397019 |
| Range | 397020 |
| Interquartile range (IQR) | 659.5 |
Descriptive statistics
| Standard deviation | 39717.45157 |
|---|---|
| Coefficient of variation (CV) | 8.135454499 |
| Kurtosis | 98.88731471 |
| Mean | 4882.02 |
| Median Absolute Deviation (MAD) | 46 |
| Skewness | 9.919426309 |
| Sum | 488202 |
| Variance | 1577475959 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| -1 | 34 | |
| 1 | 2 | 2.0% |
| 64 | 2 | 2.0% |
| 45 | 2 | 2.0% |
| 8 | 2 | 2.0% |
| 4 | 2 | 2.0% |
| 10876 | 1 | 1.0% |
| 11 | 1 | 1.0% |
| 157 | 1 | 1.0% |
| 1024 | 1 | 1.0% |
| Other values (52) | 52 |
| Value | Count | Frequency (%) |
| -1 | 34 | |
| 1 | 2 | 2.0% |
| 4 | 2 | 2.0% |
| 8 | 2 | 2.0% |
| 11 | 1 | 1.0% |
| 12 | 1 | 1.0% |
| 14 | 1 | 1.0% |
| 16 | 1 | 1.0% |
| 19 | 1 | 1.0% |
| 20 | 1 | 1.0% |
| Value | Count | Frequency (%) |
| 397019 | 1 | |
| 22584 | 1 | |
| 12684 | 1 | |
| 10876 | 1 | |
| 8248 | 1 | |
| 5341 | 1 | |
| 2631 | 1 | |
| 2196 | 1 | |
| 1933 | 1 | |
| 1905 | 1 |
Active Cases
| Distinct | 97 |
|---|---|
| Distinct (%) | 97.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 156366.43 |
| Minimum | 1 |
|---|---|
| Maximum | 13329448 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 928.0 B |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 7 |
| Q1 | 258.75 |
| median | 1964.5 |
| Q3 | 13555.25 |
| 95-th percentile | 131520.8 |
| Maximum | 13329448 |
| Range | 13329447 |
| Interquartile range (IQR) | 13296.5 |
Descriptive statistics
| Standard deviation | 1332608.558 |
|---|---|
| Coefficient of variation (CV) | 8.522344326 |
| Kurtosis | 99.38438547 |
| Mean | 156366.43 |
| Median Absolute Deviation (MAD) | 1947.5 |
| Skewness | 9.955096551 |
| Sum | 15636643 |
| Variance | 1.775845568 × 10^12 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 7 | 3 | 3.0% |
| 56 | 2 | 2.0% |
| 20392 | 1 | 1.0% |
| 3102 | 1 | 1.0% |
| 159 | 1 | 1.0% |
| 1913 | 1 | 1.0% |
| 3998 | 1 | 1.0% |
| 290 | 1 | 1.0% |
| 25078 | 1 | 1.0% |
| 537 | 1 | 1.0% |
| Other values (87) | 87 |
| Value | Count | Frequency (%) |
| 1 | 1 | 1.0% |
| 2 | 1 | 1.0% |
| 3 | 1 | 1.0% |
| 7 | 3 | |
| 8 | 1 | 1.0% |
| 11 | 1 | 1.0% |
| 16 | 1 | 1.0% |
| 18 | 1 | 1.0% |
| 27 | 1 | 1.0% |
| 36 | 1 | 1.0% |
| Value | Count | Frequency (%) |
| 13329448 | 1 | |
| 474738 | 1 | |
| 460301 | 1 | |
| 264162 | 1 | |
| 133607 | 1 | |
| 131411 | 1 | |
| 126962 | 1 | |
| 71011 | 1 | |
| 51529 | 1 | |
| 51521 | 1 |
Total Cases/1M
| Distinct | 100 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 38220.528 |
| Minimum | 16 |
|---|---|
| Maximum | 149944 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 928.0 B |
Quantile statistics
| Minimum | 16 |
|---|---|
| 5-th percentile | 522.3 |
| Q1 | 2698.75 |
| median | 21176 |
| Q3 | 71913.75 |
| 95-th percentile | 101610.05 |
| Maximum | 149944 |
| Range | 149928 |
| Interquartile range (IQR) | 69215 |
Descriptive statistics
| Standard deviation | 38228.98631 |
|---|---|
| Coefficient of variation (CV) | 1.000221303 |
| Kurtosis | -0.7044833684 |
| Mean | 38220.528 |
| Median Absolute Deviation (MAD) | 20495.5 |
| Skewness | 0.6772213828 |
| Sum | 3822052.8 |
| Variance | 1461455394 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 37404 | 1 | 1.0% |
| 1497 | 1 | 1.0% |
| 14951 | 1 | 1.0% |
| 80276 | 1 | 1.0% |
| 90215 | 1 | 1.0% |
| 56636 | 1 | 1.0% |
| 11097 | 1 | 1.0% |
| 2314 | 1 | 1.0% |
| 2449 | 1 | 1.0% |
| 2854 | 1 | 1.0% |
| Other values (90) | 90 |
| Value | Count | Frequency (%) |
| 16 | 1 | |
| 85 | 1 | |
| 223 | 1 | |
| 454 | 1 | |
| 509 | 1 | |
| 523 | 1 | |
| 565 | 1 | |
| 630 | 1 | |
| 662 | 1 | |
| 699 | 1 |
| Value | Count | Frequency (%) |
| 149944 | 1 | |
| 124276 | 1 | |
| 108884 | 1 | |
| 107849 | 1 | |
| 105164 | 1 | |
| 101423 | 1 | |
| 99481 | 1 | |
| 98807 | 1 | |
| 97034 | 1 | |
| 95456 | 1 |
Continent
| Distinct | 7 |
|---|---|
| Distinct (%) | 7.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 928.0 B |
| Africa | |
|---|---|
| Europe | |
| North America | |
| Asia | |
| South America | |
| Other values (2) |
Length
| Max length | 17 |
|---|---|
| Median length | 6 |
| Mean length | 8.15 |
| Min length | 3 |
Characters and Unicode
| Total characters | 815 |
|---|---|
| Distinct characters | 22 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 1 |
|---|---|
| Unique (%) | 1.0% |
Sample
| 1st row | Asia |
|---|---|
| 2nd row | Africa |
| 3rd row | Africa |
| 4th row | Africa |
| 5th row | North America |
Common Values
| Value | Count | Frequency (%) |
| Africa | 29 | |
| Europe | 25 | |
| North America | 20 | |
| Asia | 13 | |
| South America | 7 | 7.0% |
| Australia/Oceania | 5 | 5.0% |
| All | 1 | 1.0% |
Length
Histogram of lengths of the category
| Value | Count | Frequency (%) |
| africa | 29 | |
| america | 27 | |
| europe | 25 | |
| north | 20 | |
| asia | 13 | |
| south | 7 | 5.5% |
| australia/oceania | 5 | 3.9% |
| all | 1 | 0.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| r | 106 | |
| a | 89 | |
| i | 79 | 9.7% |
| A | 75 | 9.2% |
| c | 61 | 7.5% |
| e | 57 | 7.0% |
| o | 52 | 6.4% |
| u | 37 | 4.5% |
| t | 32 | 3.9% |
| f | 29 | 3.6% |
| Other values (12) | 198 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 651 | |
| Uppercase Letter | 132 | 16.2% |
| Space Separator | 27 | 3.3% |
| Other Punctuation | 5 | 0.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| r | 106 | |
| a | 89 | |
| i | 79 | |
| c | 61 | |
| e | 57 | |
| o | 52 | |
| u | 37 | 5.7% |
| t | 32 | 4.9% |
| f | 29 | 4.5% |
| h | 27 | 4.1% |
| Other values (5) | 82 |
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 75 | |
| E | 25 | 18.9% |
| N | 20 | 15.2% |
| S | 7 | 5.3% |
| O | 5 | 3.8% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 27 | |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 5 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 783 | |
| Common | 32 | 3.9% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| r | 106 | |
| a | 89 | |
| i | 79 | |
| A | 75 | |
| c | 61 | 7.8% |
| e | 57 | 7.3% |
| o | 52 | 6.6% |
| u | 37 | 4.7% |
| t | 32 | 4.1% |
| f | 29 | 3.7% |
| Other values (10) | 166 |
Common
| Value | Count | Frequency (%) |
| (space) | 27 | |
| / | 5 | 15.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 815 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| r | 106 | |
| a | 89 | |
| i | 79 | 9.7% |
| A | 75 | 9.2% |
| c | 61 | 7.5% |
| e | 57 | 7.0% |
| o | 52 | 6.4% |
| u | 37 | 4.5% |
| t | 32 | 3.9% |
| f | 29 | 3.6% |
| Other values (12) | 198 |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
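The calculation described above can be sketched in plain Python. This is an illustrative, standard-library-only example rather than the code the report itself runs; the function name `pearson_r` and the sample data are assumptions made for the example.

```python
import math


def pearson_r(xs, ys):
    """Covariance of X and Y divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n (or 1/(n-1)) factors of covariance and variance cancel in the
    # ratio, so plain sums of products are enough here.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
r = pearson_r(x, y)

# Shifting and positively rescaling either variable leaves r unchanged.
r_shifted = pearson_r([10 * v + 3 for v in x], [0.5 * v - 7 for v in y])
print(r, r_shifted)
```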
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
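A minimal sketch of that definition follows: rank both variables, then apply the Pearson formula to the ranks. It is an illustration under stated assumptions, not the report's implementation. The helper names (`average_ranks`, `pearson_r`, `spearman_rho`) are hypothetical, and ties are given the average of their positions, which is one common convention.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Pearson correlation: covariance over the product of standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Spearman's rho is Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but nonlinear relationship

rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

For this cubic relationship ρ reaches 1 because the ordering of the two variables agrees perfectly, while Pearson's r stays below 1 because the relationship is not linear.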
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
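The pair-counting definition above corresponds to the variant usually called tau-a, sketched below in plain Python. The function name `kendall_tau_a` and the sample data are assumptions made for the example. Libraries often default to a tie-adjusted variant (tau-b), so results on data with ties may differ from this sketch.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    pairs = list(combinations(range(len(xs)), 2))
    concordant = discordant = 0
    for i, j in pairs:
        # A pair is concordant when both variables move in the same direction.
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # Pairs tied in either variable count as neither.
    return (concordant - discordant) / len(pairs)


x = [1, 2, 3, 4, 5]
y = [3, 1, 2, 5, 4]
tau = kendall_tau_a(x, y)
print(tau)
```

With these values there are 7 concordant and 3 discordant pairs out of 10, so τ = (7 - 3) / 10.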
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for it.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. The report uses a bias-corrected measure proposed by Bergsma in 2013.
A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
First rows
| | df_index | Country | Total Cases | New Cases | Total Deaths | New Deaths | Total Recovered | New Recovered | Active Cases | Serious/Critical | Total Cases/1M | Deaths/1M | Total Tests | Test/1M | Population | Continent |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 24 | Philippines | 1524438 | 6549.0 | 26874 | 32.0 | 1449556 | 5341.0 | 48008 | 2031 | 13720 | 242 | 16141903 | 145274 | 111113614 | Asia |
| 1 | 179 | Guinea-Bissau | 4108 | 21.0 | 74 | 1.0 | 3744 | 19.0 | 290 | 4 | 2037 | 37 | 78087 | 38719 | 2016751 | Africa |
| 2 | 131 | Guinea | 24823 | 58.0 | 195 | 1.0 | 23719 | 64.0 | 909 | 24 | 1838 | 14 | 489365 | 36234 | 13505815 | Africa |
| 3 | 102 | Ghana | 99160 | -1.0 | 815 | -1.0 | 95221 | -1.0 | 3124 | 12 | 3123 | 26 | 1357389 | 42748 | 31753220 | Africa |
| 4 | 123 | Trinidad and Tobago | 36626 | 272.0 | 1003 | 3.0 | 29841 | 178.0 | 5782 | 22 | 26082 | 714 | 256269 | 182494 | 1404260 | North America |
| 5 | 119 | Malawi | 45465 | 781.0 | 1389 | 26.0 | 35432 | 229.0 | 8644 | 285 | 2314 | 71 | 309766 | 15766 | 19647757 | Africa |
| 6 | 198 | Faeroe Islands | 949 | 8.0 | 1 | -1.0 | 857 | 16.0 | 91 | -1 | 19344 | 20 | 351544 | 7165885 | 49058 | Europe |
| 7 | 55 | Bulgaria | 423319 | 96.0 | 18187 | 3.0 | 397831 | 183.0 | 7301 | 80 | 61411 | 2638 | 3469551 | 503325 | 6893263 | Europe |
| 8 | 80 | Armenia | 227936 | 220.0 | 4573 | 1.0 | 218585 | 56.0 | 4778 | -1 | 76770 | 1540 | 1282345 | 431900 | 2969077 | Asia |
| 9 | 142 | Papua New Guinea | 17524 | 9.0 | 192 | 1.0 | 17173 | 20.0 | 159 | 7 | 1920 | 21 | 143569 | 15734 | 9124865 | Australia/Oceania |
Last rows
| | df_index | Country | Total Cases | New Cases | Total Deaths | New Deaths | Total Recovered | New Recovered | Active Cases | Serious/Critical | Total Cases/1M | Deaths/1M | Total Tests | Test/1M | Population | Continent |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 90 | 78 | Moldova | 258237 | 141.0 | 6232 | 3.0 | 251125 | 92.0 | 880 | 57 | 64173 | 1549 | 1359856 | 337930 | 4024080 | Europe |
| 91 | 183 | Liechtenstein | 3070 | 1.0 | 59 | -1.0 | 2995 | 1.0 | 16 | 2 | 80276 | 1543 | 49126 | 1284575 | 38243 | Europe |
| 92 | 85 | Zambia | 188573 | 971.0 | 3162 | 24.0 | 175429 | 701.0 | 9982 | 735 | 9965 | 167 | 2030499 | 107303 | 18922971 | Africa |
| 93 | 63 | Dominican Republic | 338291 | 316.0 | 3929 | 1.0 | 282841 | 2196.0 | 51521 | 246 | 30859 | 358 | 1803305 | 164501 | 10962301 | North America |
| 94 | 182 | Mauritius | 3120 | -1.0 | 19 | -1.0 | 1854 | -1.0 | 1247 | -1 | 2449 | 15 | 358675 | 281537 | 1273991 | Africa |
| 95 | 189 | Turks and Caicos | 2458 | -1.0 | 18 | -1.0 | 2404 | -1.0 | 36 | -1 | 62597 | 458 | 94789 | 2413961 | 39267 | North America |
| 96 | 51 | Paraguay | 447146 | 879.0 | 14446 | 52.0 | 407622 | 1342.0 | 25078 | 433 | 61892 | 2000 | 1640515 | 227073 | 7224621 | South America |
| 97 | 66 | Denmark | 308615 | 851.0 | 2542 | -1.0 | 294559 | 653.0 | 11514 | 12 | 53084 | 437 | 74244457 | 12770647 | 5813680 | Europe |
| 98 | 30 | Romania | 1081875 | 102.0 | 34260 | 2.0 | 1046952 | 71.0 | 663 | 38 | 56636 | 1793 | 10285158 | 538425 | 19102321 | Europe |
| 99 | 90 | Kyrgyzstan | 152709 | 1102.0 | 2205 | 9.0 | 128905 | 1213.0 | 21599 | 195 | 23008 | 332 | 1463686 | 220527 | 6637212 | Asia |